Gwynedd
Clinical knowledge in LLMs does not translate to human interactions
Bean, Andrew M., Payne, Rebecca, Parsons, Guy, Kirk, Hannah Rose, Ciro, Juan, Mosquera, Rafael, Monsalve, Sara Hincapié, Ekanayaka, Aruna S., Tarassenko, Lionel, Rocher, Luc, Mahdi, Adam
Global healthcare providers are exploring the use of large language models (LLMs) to provide medical advice to the public. LLMs now achieve nearly perfect scores on medical licensing exams, but this does not necessarily translate to accurate performance in real-world settings. We tested whether LLMs can assist members of the public in identifying underlying conditions and choosing a course of action (disposition) in ten medical scenarios in a controlled study with 1,298 participants. Participants were randomly assigned to receive assistance from an LLM (GPT-4o, Llama 3, Command R+) or a source of their choice (control). Tested alone, LLMs complete the scenarios accurately, correctly identifying conditions in 94.9% of cases and disposition in 56.3% on average. However, participants using the same LLMs identified relevant conditions in less than 34.5% of cases and disposition in less than 44.2%, both no better than the control group. We identify user interactions as a challenge to the deployment of LLMs for medical advice. Standard benchmarks for medical knowledge and simulated patient interactions do not predict the failures we find with human participants. Moving forward, we recommend systematic human user testing to evaluate interactive capabilities prior to public deployments in healthcare.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Asia > Middle East > Saudi Arabia (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Research Report > Strength High (1.00)
- Research Report > Experimental Study (1.00)
A Hierarchical Architecture for Human-Robot Cooperation Processes
Darvish, Kourosh, Simetti, Enrico, Mastrogiovanni, Fulvio, Casalino, Giuseppe
In this paper we propose FlexHRC+, a hierarchical human-robot cooperation architecture designed to provide collaborative robots with an extended degree of autonomy when supporting human operators in high-variability shop-floor tasks. The architecture encompasses three levels: perception, representation, and action. Building on previous work, here we focus on (i) an in-the-loop decision-making process for the operations of collaborative robots coping with the variability of actions carried out by human operators, and (ii) the representation level, integrating a hierarchical AND/OR graph whose online behaviour is formally specified using First Order Logic. The architecture is accompanied by experiments including collaborative furniture assembly and object positioning tasks.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
Bounds for the VC Dimension of 1NN Prototype Sets
Gunn, Iain A. D., Kuncheva, Ludmila I.
In Statistical Learning, the Vapnik-Chervonenkis (VC) dimension is an important combinatorial property of classifiers. To our knowledge, no theoretical results yet exist for the VC dimension of edited nearest-neighbour (1NN) classifiers with a reference set of fixed size. Related theoretical results are scattered in the literature and their implications have not been made explicit. We collect some relevant results and use them to provide explicit lower and upper bounds for the VC dimension of 1NN classifiers with a prototype set of fixed size. We discuss the implications of these bounds for the size of training set needed to learn such a classifier to a given accuracy. Further, we provide a new lower bound for the two-dimensional case, based on a new geometrical argument.
- North America > United States > New York (0.04)
- North America > United States > Texas (0.04)
- Europe > United Kingdom > Wales > Gwynedd (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
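To make the object of study in the entry above concrete, here is a minimal sketch of a 1NN classifier with a fixed-size prototype set, together with an elementary shattering check. It illustrates only the trivial observation that m prototypes can reproduce any labelling of m distinct points (by placing one prototype on each point); it is not the bounds or the geometrical argument from the paper. The function names `predict` and `shatters_by_copying` are hypothetical.

```python
import math
from itertools import product


def predict(x, prototypes):
    """Return the label of the prototype nearest to x (Euclidean distance)."""
    best_label, best_dist = None, math.inf
    for point, label in prototypes:
        d = math.dist(x, point)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


def shatters_by_copying(points):
    """Check that every binary labelling of `points` is realised by a 1NN
    classifier whose prototypes are the points themselves.

    Assumes the points are distinct, so each point is its own nearest prototype.
    """
    for labels in product([0, 1], repeat=len(points)):
        prototypes = list(zip(points, labels))
        if any(predict(p, prototypes) != y for p, y in zip(points, labels)):
            return False
    return True


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(shatters_by_copying(points))
```

The check enumerates all 2^m labellings, so it is only practical for small m; it is meant as an illustration of what "shattering" means for this classifier family, not as a tool for computing the VC dimension.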
Instance Selection Improves Geometric Mean Accuracy: A Study on Imbalanced Data Classification
Kuncheva, Ludmila I., Arnaiz-González, Álvar, Díez-Pastor, José-Francisco, Gunn, Iain A. D.
A natural way of handling imbalanced data is to attempt to equalise the class frequencies and train the classifier of choice on balanced data. For two-class imbalanced problems, the classification success is typically measured by the geometric mean (GM) of the true positive and true negative rates. Here we prove that GM can be improved upon by instance selection, and give the theoretical conditions for such an improvement. We demonstrate that GM is non-monotonic with respect to the number of retained instances, which discourages systematic instance selection. We also show that balancing the distribution frequencies is inferior to a direct maximisation of GM. To verify our theoretical findings, we carried out an experimental study of 12 instance selection methods for imbalanced data, using 66 standard benchmark data sets. The results reveal possible room for new instance selection methods for imbalanced data.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Wisconsin (0.04)
- North America > United States > Texas (0.04)
- Information Technology > Data Science > Data Quality > Instance Selection (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
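The geometric mean (GM) measure named in the entry above can be sketched as follows: it is the square root of the product of the true positive rate and the true negative rate. The function name `geometric_mean_accuracy`, the `positive` parameter, and the toy labels are assumptions for illustration and do not come from the paper.

```python
import math


def geometric_mean_accuracy(y_true, y_pred, positive=1):
    """Geometric mean of true positive rate and true negative rate."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    # A class with no instances contributes a rate of 0 by convention here.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(tpr * tnr)


# Toy imbalanced problem: 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10
mixed = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]

print(geometric_mean_accuracy(y_true, always_negative))
print(geometric_mean_accuracy(y_true, mixed))
```

On the toy data, a classifier that always predicts the majority class is 80% accurate yet has a GM of 0, because its true positive rate is 0. This is why GM is commonly preferred over plain accuracy for imbalanced two-class problems.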
Constituent Grammatical Evolution
Georgiou, Loukas (Bangor University) | Teahan, William J. (Bangor University)
We present Constituent Grammatical Evolution (CGE), a new evolutionary automatic programming algorithm that extends the standard Grammatical Evolution algorithm by incorporating the concepts of constituent genes and conditional behaviour-switching. From elementary and more complex building blocks, CGE builds a control program that dictates the behaviour of an agent; it is applicable to the class of problems where the subject of search is the behaviour of an agent in a given environment. It takes advantage of the powerful Grammatical Evolution feature of using a BNF grammar definition as a plug-in component to describe the output language to be produced by the system. The main benchmark problem on which CGE is evaluated is the Santa Fe Trail problem, using a BNF grammar definition that defines a search space semantically equivalent to that of the original definition of the problem by Koza. Furthermore, CGE is evaluated on two additional problems, the Los Altos Hills and the Hampton Court Maze. The experimental results demonstrate that Constituent Grammatical Evolution outperforms the standard Grammatical Evolution algorithm in these problems, in terms of both efficiency (percent of solutions found) and effectiveness (number of required steps of solutions found).
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Illinois > Cook County > Evanston (0.04)
- Europe > United Kingdom > Wales > Gwynedd > Bangor (0.04)
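As background for the entry above, the following is a minimal sketch of the genotype-to-phenotype mapping used in standard Grammatical Evolution, where integer codons select productions from a BNF grammar by a modulo rule. It does not implement CGE's constituent genes or conditional behaviour-switching. The toy grammar, the function name `map_genotype`, and the `max_wraps` parameter are hypothetical, and the grammar is not the one used in the paper.

```python
# Hypothetical toy grammar for an ant-like agent; keys are nonterminals,
# values are lists of productions (each production is a list of symbols).
GRAMMAR = {
    "<code>": [["<line>"], ["<code>", "<line>"]],
    "<line>": [["<op>"], ["if_food_ahead(", "<line>", ",", "<line>", ")"]],
    "<op>": [["move()"], ["left()"], ["right()"]],
}


def map_genotype(codons, grammar, start="<code>", max_wraps=2):
    """Expand the leftmost nonterminal repeatedly, choosing each production
    as codon % number_of_productions. Returns None if the codons (reused up
    to max_wraps extra times) run out before the program is complete."""
    symbols = [start]
    output = []
    used = 0
    limit = len(codons) * (max_wraps + 1)
    while symbols:
        sym = symbols.pop(0)
        if sym not in grammar:
            output.append(sym)  # terminal symbol: emit as-is
            continue
        choices = grammar[sym]
        if len(choices) == 1:
            index = 0  # no choice to make, so no codon is consumed
        else:
            if used >= limit:
                return None  # mapping failed: invalid individual
            index = codons[used % len(codons)] % len(choices)
            used += 1
        symbols = list(choices[index]) + symbols
    return "".join(output)


print(map_genotype([0, 1, 2, 0, 0, 1], GRAMMAR))
```

Because the grammar is passed in as data, swapping in a different BNF definition changes the output language without touching the mapping code, which is the "plug-in component" property the abstract refers to.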
Calendar of Events
The seventh biennial Bar-Ilan International Symposium on the Foundations of Artificial Intelligence will be held on June 25-27, 2001 in Ramat Gan, Israel. The meeting will honor the research and accomplishments of Yaacov Choueka and will therefore place special emphasis on natural language processing and computational linguistics, in addition to the usual topics of the symposium.
Yaacov Choueka, Jieh Hsiang, Daphne Koller, Richard Korf, Doug Lenat, Moshe Vardi
The BISFAI-01 program, schedule and registration information will be available at the BISFAI website: www.cs.biu.ac.il/ bisfai, along with abstracts of invited and accepted papers and pointers to online versions. For further information or requests, contact: bisfai@cs.biu.ac.il.
CONTEXT-01 EST Setubal, Campus do IPS / R. Vale
www.dfki.de/um2001
Faculty Positions for Intelligent Aerospace Systems Program
The College of Engineering at the University of Oklahoma invites applications for 3 to 5 new faculty positions at all levels in the area of Intelligent Systems.
- North America > United States > Oklahoma > Cleveland County > Norman (0.29)
- Europe > Portugal > Setubal > Setubal (0.26)
- Europe > Austria > Vienna (0.17)
- Education (1.00)
- Aerospace & Defense (0.71)
- Transportation > Air (0.35)